Outlier preservation by dimensionality reduction techniques

Author

  • Martijn Onderwater
Abstract

Sensors are increasingly part of our daily lives: motion detection, lighting control, and energy consumption all rely on sensors. Combining this information into, for instance, simple and comprehensive graphs can be quite challenging. Dimensionality reduction is often used to address this problem, by decreasing the number of variables in the data and looking for shorter representations. However, dimensionality reduction is often aimed at normal daily data, and applying it to events deviating from this daily data (so-called outliers) can affect such events negatively. In particular, outliers might go unnoticed. In this paper we show that dimensionality reduction can indeed have a large impact on outliers. To that end we apply three dimensionality reduction techniques to three real-world data sets, and inspect how well they preserve outliers. We use several performance measures to show how well these techniques are capable of preserving outliers, and we discuss the results.
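The effect described in the abstract can be illustrated with a toy example. The abstract does not name the three techniques, the data sets, or the performance measures, so everything in this sketch is an assumption made for illustration: PCA stands in for the reduction technique, a per-coordinate z-score rule stands in for the outlier detector, the data are synthetic, and the "preserved fraction" is a hypothetical measure.

```python
import math

def zscore_outliers(columns, threshold=3.0):
    """Return indices of points whose |z-score| exceeds threshold in any column."""
    flagged = set()
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        if std == 0:
            continue  # a constant column cannot flag anything
        flagged |= {i for i, v in enumerate(col) if abs(v - mean) / std > threshold}
    return flagged

def first_principal_component(xs, ys):
    """Project 2-D points onto their first principal axis.

    Uses the closed form for a 2x2 covariance matrix: the major axis
    makes an angle theta with the x-axis, where
    tan(2 * theta) = 2 * sxy / (sxx - syy).
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return [(x - mx) * math.cos(theta) + (y - my) * math.sin(theta)
            for x, y in zip(xs, ys)]

# Synthetic data (assumption): 21 "normal" points spread along x with tiny
# noise in y, plus one point (index 21) that deviates only in y.
xs = [float(i - 10) for i in range(21)] + [0.0]
ys = [0.1 if i % 2 == 0 else -0.1 for i in range(21)] + [5.0]

outliers_2d = zscore_outliers([xs, ys])                            # before reduction
outliers_1d = zscore_outliers([first_principal_component(xs, ys)])  # after reduction

# Hypothetical measure: fraction of original outliers still flagged afterwards.
preserved = len(outliers_2d & outliers_1d) / len(outliers_2d) if outliers_2d else 1.0
print(outliers_2d, outliers_1d, preserved)
```

In this constructed case the deviating point is flagged in the original two-dimensional data but not after projection, because it deviates along the low-variance direction that the projection discards. Real sensor data and other reduction techniques may behave quite differently.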


Similar resources

Iterative Embedding with Robust Correction using Feedback of Error Observed

Nonlinear dimensionality reduction techniques of today are highly sensitive to outliers. Almost all of them are spectral methods and differ from each other over their treatment of the notion of neighborhood similarities computed amongst the high-dimensional input data points. These techniques aim to preserve the notion of this similarity structure in the low-dimensional output. The presence of ...


2D Dimensionality Reduction Methods without Loss

In this paper, several two-dimensional extensions of principal component analysis (PCA) and linear discriminant analysis (LDA) techniques have been applied in a lossless dimensionality reduction framework for face recognition. In this framework, the benefits of dimensionality reduction were used to improve the performance of its predictive model, which was a support vector machine (...


Similarity Consideration for Visualization and Manifold Geometry Preservation

Manifold learning techniques are used to preserve the original geometry of a dataset after reduction by preserving the distances among data points. MDS (Multidimensional Scaling), ISOMAP (Isometric Feature Mapping), and LLE (Locally Linear Embedding) are some of the dimension reduction methods that preserve geometrical structure. In this paper, we have compared MDS and ISOMAP and considered similarity as...


Dimensionality Reduction with Subspace Structure Preservation

Modeling data as being sampled from a union of independent subspaces has been widely applied to a number of real world applications. However, dimensionality reduction approaches that theoretically preserve this independence assumption have not been well studied. Our key contribution is to show that 2K projection vectors are sufficient for the independence preservation of any K class data sample...


Spatial Distance Preservation based Methods for Non-Linear Dimensionality Reduction

The preservation of the pairwise distances measured in a data set ensures that the low-dimensional embedding inherits the main geometric properties of the data, such as the local neighborhood relationships. In this paper, distance-preserving techniques, namely Sammon's nonlinear mapping (Sammon's NLM) and Curvilinear Component Analysis (CCA), have been discussed and compared for non-linear dimensional...



Journal title:
  • IJDATS

Volume 7  Issue 

Pages  -

Publication date 2015